Mandalay Region
Appendix A Variational Paragraph Embedder A.1 Selection of substitution rate p
Figure 4: Impact of the proportion of injected noise for learning Paragraph Embeddings on XSum dataset. (Figure 4). The results of the ablation study are presented in Table 5. Embedder in providing clean and denoised reconstructions. In general, it has been observed that generations progress in a coarse-to-fine manner. The early time step, which is close to 1, tends to be less fluent and generic. This was the nicest stay we have ever had. Turtle Bay was a great resort. This was the nicest stay we have ever had.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Oceania > Australia (0.04)
- North America > United States > Virginia (0.04)
- (12 more...)
What's happening in Myanmar's civil war as military holds elections?
Voters in parts of Myanmar are heading to the polls on Sunday for an election that critics view as a bid by the country's generals to legitimise military rule, nearly five years after they overthrew the government of Nobel Laureate Aung San Suu Kyi. The multi-phased election is unfolding amid a raging civil war, with ethnic armed groups and opposition militias fighting the military for control of vast stretches of territory, stretching from the borderlands with Bangladesh and India in the west, across the central plains, to the frontiers with China and Thailand in the north and east. Another third will be covered during a second and third phase in January, while voting has been cancelled altogether in the remainder. Fighting, including air raids and arson, has intensified in several areas.
- North America > United States (0.65)
- South America (0.51)
- North America > Central America (0.41)
- (16 more...)
- Government > Military (1.00)
- Government > Regional Government (0.70)
Miniature soft robot with magnetically reprogrammable surgical functions
Ng, Chelsea Shan Xian, Yeoh, Yu Xuan, Foo, Nicholas Yong Wei, Radhakrishnan, Keerthana, Lum, Guo Zhan
Miniature robots are untethered actuators, which have significant potential to make existing minimally invasive surgery considerably safer and painless, and enable unprecedented treatments because they are much smaller and more dexterous than existing surgical robots. Of the miniature robots, the magnetically actuated ones are the most functional and dexterous. However, existing magnetic miniature robots are currently impractical for surgery because they are either restricted to at most two on-board functionalities or limited to five-degrees-of-freedom (DOF) locomotion. Some of these actuators are also operational only in specialized environments where the strong external actuating magnets must be in very close proximity (< 4 cm away). Here we present a millimeter-scale soft robot whose magnetization profile can be reprogrammed on command to perform five surgical functionalities: drug dispensing, cutting through biological tissues (simulated with gelatin), gripping, storing (biological) samples, and remote heating. By possessing full six-DOF motions, including the sixth-DOF rotation about its net magnetic moment, our soft robot can also roll and two-anchor crawl across challenging unstructured environments, which are impassable for its five-DOF counterparts. Because our actuating magnetic fields are relatively uniform and weak (at most 65 mT and 1.5 T/m), such fields can theoretically penetrate through biological tissues harmlessly and allow our soft robot to remain controllable within the depths of the human body. We envision that this work marks a major milestone for the advancement of soft actuators, and towards revolutionizing minimally invasive treatments with untethered miniature robots that have unprecedented functionalities.
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Diagnostic Medicine (0.92)
- Materials > Chemicals (0.88)
- Health & Medicine > Nuclear Medicine (0.67)
Neural Combinatorial Optimization for Real-World Routing
Son, Jiwoo, Zhao, Zhikai, Berto, Federico, Hua, Chuanbo, Kwon, Changhyun, Park, Jinkyoo
Vehicle Routing Problems (VRPs) are a class of NP-hard problems ubiquitous in several real-world logistics scenarios that pose significant challenges for optimization. Neural Combinatorial Optimization (NCO) has emerged as a promising alternative to classical approaches, as it can learn fast heuristics to solve VRPs. However, most research works in NCO for VRPs focus on simplified settings: they do not account for asymmetric distances and travel durations, which cannot be derived from simple Euclidean distances, and they rely on unrealistic data distributions, hindering real-world deployment. This work introduces RRNCO (Real Routing NCO) to bridge the gap of NCO between synthetic and real-world VRPs in the critical aspects of both data and modeling. First, we introduce a new, openly available dataset of real-world data containing a diverse set of locations, distance matrices, and duration matrices from 100 cities, considering realistic settings with actual routing distances and durations obtained from Open Source Routing Machine (OSRM). Second, we propose a novel approach that efficiently processes both node and edge features through contextual gating, enabling the construction of more informed node embeddings. Finally, we incorporate an Adaptation Attention Free Module (AAFM) with neural adaptive bias mechanisms that effectively integrates not only distance matrices but also angular relationships between nodes, allowing our model to capture rich structural information. RRNCO achieves state-of-the-art results in real-world VRPs among NCO methods. We make our dataset and code publicly available at https://github.com/ai4co/real-routing-nco.
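To make the point about asymmetric durations concrete, here is a minimal sketch, not taken from the RRNCO code base. The matrix values and the helper name `tour_cost` are made up for illustration. With a symmetric, Euclidean-style matrix, a closed tour and its reverse always cost the same; with an asymmetric duration matrix they can differ.

```python
# Hypothetical 3-node travel-duration matrix (minutes); DURATIONS[i][j] is the
# time from node i to node j. Values are invented and deliberately asymmetric.
DURATIONS = [
    [0, 2, 9],
    [7, 0, 4],
    [3, 8, 0],
]


def tour_cost(matrix, tour):
    """Total cost of visiting `tour` in order and returning to the start."""
    next_stops = tour[1:] + tour[:1]  # successor of each stop, wrapping around
    return sum(matrix[a][b] for a, b in zip(tour, next_stops))


forward = tour_cost(DURATIONS, [0, 1, 2])   # 0 -> 1 -> 2 -> 0
backward = tour_cost(DURATIONS, [0, 2, 1])  # 0 -> 2 -> 1 -> 0
print(forward, backward)  # 9 24
```

The same set of stops costs 9 in one direction and 24 in the other, which a model that only sees node coordinates cannot represent.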
- Asia > East Asia (0.05)
- Europe > Northern Europe (0.05)
- Asia > Southeast Asia (0.05)
- (80 more...)
Knowledge Graph-Guided Retrieval Augmented Generation
Zhu, Xiangrong, Xie, Yuexiang, Liu, Yi, Li, Yaliang, Hu, Wei
Retrieval-augmented generation (RAG) has emerged as a promising technology for addressing hallucination issues in the responses generated by large language models (LLMs). Existing studies on RAG primarily focus on applying semantic-based approaches to retrieve isolated relevant chunks, which ignore their intrinsic relationships. In this paper, we propose a novel Knowledge Graph-Guided Retrieval Augmented Generation (KG$^2$RAG) framework that utilizes knowledge graphs (KGs) to provide fact-level relationships between chunks, improving the diversity and coherence of the retrieved results. Specifically, after performing a semantic-based retrieval to provide seed chunks, KG$^2$RAG employs a KG-guided chunk expansion process and a KG-based chunk organization process to deliver relevant and important knowledge in well-organized paragraphs. Extensive experiments conducted on the HotpotQA dataset and its variants demonstrate the advantages of KG$^2$RAG compared to existing RAG-based approaches, in terms of both response quality and retrieval quality.
- North America > United States > District of Columbia > Washington (0.14)
- North America > United States > New York (0.04)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- (21 more...)
- Leisure & Entertainment (1.00)
- Media > Film (0.94)
Evaluating Self-Generated Documents for Enhancing Retrieval-Augmented Generation with Large Language Models
Li, Jiatao, Hu, Xinyu, Yin, Xunjian, Wan, Xiaojun
The integration of documents generated by LLMs themselves (Self-Docs) alongside retrieved documents has emerged as a promising strategy for retrieval-augmented generation systems. However, previous research primarily focuses on optimizing the use of Self-Docs, with their inherent properties remaining underexplored. To bridge this gap, we first investigate the overall effectiveness of Self-Docs, identifying key factors that shape their contribution to RAG performance (RQ1). Building on these insights, we develop a taxonomy grounded in Systemic Functional Linguistics to compare the influence of various Self-Docs categories (RQ2) and explore strategies for combining them with external sources (RQ3). Our findings reveal which types of Self-Docs are most beneficial and offer practical guidelines for leveraging them to achieve significant improvements in knowledge-intensive question answering tasks.
- Asia > Russia (1.00)
- Europe > Poland > Masovia Province > Warsaw (0.04)
- Europe > Eastern Europe (0.04)
- (14 more...)
- Transportation > Air (1.00)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- (5 more...)
CHisIEC: An Information Extraction Corpus for Ancient Chinese History
Tang, Xuemei, Deng, Zekun, Su, Qi, Yang, Hao, Wang, Jun
Natural Language Processing (NLP) plays a pivotal role in the realm of Digital Humanities (DH) and serves as the cornerstone for advancing the structural analysis of historical and cultural heritage texts. This is particularly true for the domains of named entity recognition (NER) and relation extraction (RE). In our commitment to expediting the study of ancient history and culture, we present the "Chinese Historical Information Extraction Corpus" (CHisIEC). CHisIEC is a meticulously curated dataset designed to develop and evaluate NER and RE tasks, offering a resource to facilitate research in the field. With data from 13 dynasties spanning over 1830 years, CHisIEC epitomizes the extensive temporal range and text heterogeneity inherent in Chinese historical documents. The dataset encompasses four distinct entity types and twelve relation types, resulting in a meticulously labeled dataset comprising 14,194 entities and 8,609 relations. To establish the robustness and versatility of our dataset, we have undertaken comprehensive experimentation involving models of various sizes and paradigms. Additionally, we have evaluated the capabilities of Large Language Models (LLMs) in the context of tasks related to ancient Chinese history. The dataset and code are available at https://github.com/tangxuemei1995/CHisIEC.
- Asia > China (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- Asia > Myanmar > Mandalay Region > Mandalay (0.04)
- (7 more...)
Repetition Improves Language Model Embeddings
Springer, Jacob Mitchell, Kotha, Suhas, Fried, Daniel, Neubig, Graham, Raghunathan, Aditi
Recent approaches to improving the extraction of text embeddings from autoregressive large language models (LLMs) have largely focused on improvements to data, backbone pretrained language models, or improving task-differentiation via instructions. In this work, we address an architectural limitation of autoregressive models: token embeddings cannot contain information from tokens that appear later in the input. To address this limitation, we propose a simple approach, "echo embeddings," in which we repeat the input twice in context and extract embeddings from the second occurrence. We show that echo embeddings of early tokens can encode information about later tokens, allowing us to maximally leverage high-quality LLMs for embeddings. On the MTEB leaderboard, echo embeddings improve over classical embeddings by over 9% zero-shot and by around 0.7% when fine-tuned. Echo embeddings with a Mistral-7B model achieve state-of-the-art compared to prior open source models that do not leverage synthetic fine-tuning data.
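The limitation and the fix described in this abstract can be illustrated with a toy sketch. It is not the authors' implementation: the "model" below is a hypothetical stand-in (`causal_states`, a running mean over token vectors) whose only relevant property is that position i sees positions 0..i, as in an autoregressive LLM. The paper's prompt template and pooling choices are not reproduced.

```python
import numpy as np


def causal_states(token_vecs):
    """Toy causal encoder (assumption): state i is the mean of tokens 0..i."""
    running_sum = np.cumsum(token_vecs, axis=0)
    counts = np.arange(1, len(token_vecs) + 1)[:, None]
    return running_sum / counts


def classical_embedding(token_vecs):
    """Per-token states from a single pass over the input."""
    return causal_states(token_vecs)


def echo_embedding(token_vecs):
    """Feed the input twice and keep only the states of the second copy."""
    doubled = np.concatenate([token_vecs, token_vecs], axis=0)
    return causal_states(doubled)[len(token_vecs):]


# Two inputs that share the first token and differ only in the last one.
a = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([[1.0, 0.0], [0.0, -1.0]])

# Classical: the first token's state cannot reflect the later difference.
print(np.allclose(classical_embedding(a)[0], classical_embedding(b)[0]))  # True
# Echo: the first token of the second copy has already seen the whole input.
print(np.allclose(echo_embedding(a)[0], echo_embedding(b)[0]))  # False
```

In this toy setting the early-token state from the second occurrence changes when a later token changes, which is the property the abstract attributes to echo embeddings.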
- Asia > Singapore (0.04)
- Asia > Myanmar > Mandalay Region > Mandalay (0.04)
- Asia > Middle East > UAE (0.04)
- (3 more...)
- Media (0.46)
- Health & Medicine (0.46)
- Leisure & Entertainment (0.46)
Do LLMs Know about Hallucination? An Empirical Investigation of LLM's Hidden States
Duan, Hanyu, Yang, Yi, Tam, Kar Yan
Large Language Models (LLMs) can make up answers that are not real, and this is known as hallucination. This research aims to see if, how, and to what extent LLMs are aware of hallucination. More specifically, we check whether and how an LLM reacts differently in its hidden states when it answers a question right versus when it hallucinates. To do this, we introduce an experimental framework which allows examining LLM's hidden states in different hallucination situations. Building upon this framework, we conduct a series of experiments with language models in the LLaMA family (Touvron et al., 2023). Our empirical findings suggest that LLMs react differently when processing a genuine response versus a fabricated one. We then apply various model interpretation techniques to help understand and explain the findings better. Moreover, informed by the empirical observations, we show great potential of using the guidance derived from LLM's hidden representation space to mitigate hallucination. We believe this work provides insights into how LLMs produce hallucinated answers and how to make them occur less often.
- South America > Peru (0.04)
- Asia > China > Anhui Province (0.04)
- Asia > Middle East > Jordan (0.04)
- (11 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.97)
- Media > Film (1.00)
- Leisure & Entertainment > Sports > Football (1.00)